Analysing Finnish with word lists: the DDI approach to morphology revisited
نویسنده
چکیده
Morphological lexicons for morphologically complex languages provide good text coverage at the cost of overgeneration, difficulty of modification, and sometimes performance issues. Use of simple, manageable lexicon forms – especially lists – for morphologically complex languages may appear unviable because the number of possible word-forms in a morphologically complex language can be prohibitively high. We created and experimented with a list-based lexicon for a morphologically complex language (Finnish), and compared its coverage with that of a mature morphological analyser on new text in two experimental settings. e observed smallish difference in coverage suggests the viability of using simple and easy-to-modify list-based lexicons as an initial part of morphological analysis, to increase developer control on the vast majority of input tokens.
منابع مشابه
Do We Need Discipline-Specific Academic Word Lists? Linguistics Academic Word List (LAWL)
This corpus-based study aimed at exploring the most frequently-used academic words in linguistics and compare the wordlist with the distribution of high frequency words in Coxhead’s Academic Word List (AWL) and West’s General Service List (GSL) to examine their coverage within the linguistics corpus. To this end, a corpus of 700 linguistics research articles (LRAC), consisting of approximately ...
متن کاملMaterial Development and English for Academic Purposes Word Lists; a Reductionist Approach
Nagy (1988) states that vocabulary is a prerequisite factor in comprehension. Drawing upon a reductionist approach and having in mind the prospects for material development, this study aimed at creating an English for Academic Purposes Word List (EAPWL). The corpus of this study was compiled from a corpus containing 6479 pages of texts, 2,081,678 million tokens (running words) and 63825 types (...
متن کاملVocabulary Lists for EAP and Conversation Students
Despite the abundance of research investigating general and academic vocabularies and developing dozens of word lists, few studies have compared academic vocabulary with general service word lists such as conversation vocabulary. Many EAP researchers assume that university students need to know all the words in West’s (1953) General Service List (GSL) as a prerequisite to academic words (e.g., ...
متن کاملNeural dynamics of reading morphologically complex words
Despite considerable research interest, it is still an open issue as to how morphologically complex words such as "car+s" are represented and processed in the brain. We studied the neural correlates of the processing of inflected nouns in the morphologically rich Finnish language. Previous behavioral studies in Finnish have yielded a robust inflectional processing cost, i.e., inflected words ar...
متن کاملBertrand’s Paradox Revisited: More Lessons about that Ambiguous Word, Random
The Bertrand paradox question is: “Consider a unit-radius circle for which the length of a side of an inscribed equilateral triangle equals 3 . Determine the probability that the length of a ‘random’ chord of a unit-radius circle has length greater than 3 .” Bertrand derived three different ‘correct’ answers, the correctness depending on interpretation of the word, random. Here we employ geomet...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2018